Lessons from Writing a Dissertation with Quarto
AMS Institute
January 29, 2026
Sounds familiar? 🤔
Our analyses live in fragments:
| Step | Tool | Format |
|---|---|---|
| Data cleaning | Excel/Python | .xlsx, .py |
| Statistics | Stata/R | .do, .R, .py |
| Figures | QGIS/matplotlib | .png, .jpeg, .pdf |
| Writing | Word/LaTeX | .docx, .tex |
| Presentation | PowerPoint | .pptx |
Move from fragmented to integrated, reproducible workflows
A consistent folder structure makes everything easier:
A repo per project
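As a rough illustration of a consistent, one-repo-per-project layout, the sketch below scaffolds a project skeleton with Python's standard library. Only `data/processed` appears in these slides; every other folder name, and the `scaffold` helper itself, is a hypothetical choice for illustration. It is a stand-in for a template tool such as Cookiecutter, not a replacement for it.

```python
from pathlib import Path
import tempfile

# Hypothetical layout: only data/processed is taken from the slides,
# the remaining folder names are assumptions.
PROJECT_DIRS = ["data/raw", "data/processed", "scripts", "figures", "manuscript"]


def scaffold(root, dirs=PROJECT_DIRS):
    """Create the project folders under `root` and return the created paths."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    created = []
    for rel in dirs:
        path = root / rel
        path.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(path)
    readme = root / "README.md"
    if not readme.exists():
        # Minimal placeholder; never overwrite an existing README
        readme.write_text(f"# {root.name}\n", encoding="utf-8")
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is left behind
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "my-project"
        scaffold(project)
        for p in sorted(project.rglob("*")):
            print(p.relative_to(project))
```

Because the function is idempotent, it can be re-run on an existing project without touching files that are already there.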
Have you experienced this?
Git tracks changes to your files over time.
Key concepts:
Track:
- .R, .py, .qmd
- README.md, .txt, configuration files …
Not to track (.gitignore):
- .xlsx, .shp, .db
Our analysis depends on specific package versions.
The problem:
# Create a new project (creates pyproject.toml)
uv init my-project
# Add dependencies (creates/updates uv.lock)
uv add pandas matplotlib statsmodels
# Run scripts (auto-creates venv, installs deps)
uv run python analysis.py
# Restore on another machine
uv sync # reads uv.lock, recreates exact environment
uv manages everything: venv, dependencies, lockfile.
renv creates isolated, reproducible R environments.
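Whichever tool manages the environment, a script can also report the package versions it actually ran with, which makes a mismatch with the lockfile easy to spot. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is hypothetical, and the package list simply mirrors the `uv add` example above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the `uv add` example
    report = installed_versions(["pandas", "matplotlib", "statsmodels"])
    for name, version in report.items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

Writing this report next to the results (for example at the top of a log file) is one way to record the environment alongside the output; it complements a lockfile rather than replacing it.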
Combine code, results, and narrative in one document.
Benefits:
An open-source scientific and technical publishing system.
Quarto Process Graphic: rdatatoolbox
---
title: "House price dynamics in the Netherlands"
author: Eyayaw Beze
date: last-modified
format: pdf
---
```{r}
#| label: setup
library(data.table)
library(ggplot2)
# Constants
start_year = 2005
end_year = 2025 # max(data$year)
```
## Introduction
We analyze housing prices in the Netherlands from `{r} start_year` to `{r} end_year`.
```{r}
#| label: import-data
data = fread("data/processed/prices.csv")
```
## Data
We use a novel dataset from CBS with `{r} nrow(unique(data))` unique transactions.
The summary statistics of the dataset are shown in @tbl-descriptive-stats.
In `{r} start_year`, the average house price was `{r} data[year==start_year, mean(price_real)]`.
```{r}
#| label: tbl-descriptive-stats
desc_stats = data[, .(
  min = min(price_real),
  mean = mean(price_real),
  median = median(price_real),
  q3 = quantile(price_real, 0.75),
  max = max(price_real),
  sd = sd(price_real)
)]
knitr::kable(desc_stats)
```
Brief demo of Quarto features:
Trends of property values in the Netherlands across Municipalities using Waardering Onroerende Zaken (WOZ)
We’ll analyze housing price trends in the Netherlands using WOZ data.
File: ./demo/woz_analysis.qmd
You can organize by sections/chapters
Starting simple + consistency
Common Pitfalls
Absolute paths (C:\Users\...)
Quarto
Git
| Practice | Tool |
|---|---|
| Directory structure | Cookiecutter |
| Version control | Git + GitHub |
| Package dependencies | uv/pip/renv |
| Literate programming | Quarto |
Thank you!
Contact: eyayaw.beze@ams-institute.org
Dissertation code:
Reproducible Research Workflows · AMS Institute · All-hands monthly